
Vehicle Classification under Extreme Imbalance: A Comparative Study of Ensemble Learning and CNNs

Syarubany, Abu Hanif Muhammad

arXiv.org Artificial Intelligence

We curate a 16-class corpus (~47k images) by merging Kaggle, ImageNet, and web-crawled data, and create six balanced variants via SMOTE oversampling and targeted undersampling. Lightweight ensembles, such as Random Forest, AdaBoost, and a soft-voting combiner built on MobileNet-V2 features, are benchmarked against a configurable ResNet-style CNN trained with strong augmentation and label smoothing. The best ensemble (SMOTE-combined) attains 74.8% test accuracy, while the final CNN checkpoint achieves 79.19% on the full test set and 81.25% on an unseen EE531 inference batch, confirming the advantage of deep models. Nonetheless, the most under-represented class (Barge) remains a failure mode, highlighting the limits of rebalancing alone. Results suggest prioritizing additional minority-class collection and cost-sensitive objectives (e.g., focal loss), and exploring hybrid ensemble/CNN pipelines to combine interpretability with representational power.
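The SMOTE rebalancing the abstract mentions synthesizes new minority-class samples by interpolating between a real sample and one of its nearest neighbours. A minimal sketch of the idea (not the paper's pipeline; the function name and parameters are illustrative):

```python
import numpy as np

def smote_oversample(X, n_new, k=5, seed=0):
    """Generate n_new synthetic minority-class samples (SMOTE-style):
    pick a sample, pick one of its k nearest neighbours, and return a
    random point on the line segment between them."""
    rng = np.random.default_rng(seed)
    n = len(X)
    # pairwise distances within the minority class
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                  # exclude self
    neighbours = np.argsort(d, axis=1)[:, :k]    # k nearest per sample
    synth = []
    for _ in range(n_new):
        i = rng.integers(n)
        j = neighbours[i, rng.integers(min(k, n - 1))]
        gap = rng.random()                       # interpolation weight
        synth.append(X[i] + gap * (X[j] - X[i]))
    return np.vstack(synth)
```

Because each synthetic point is a convex combination of two real points, it stays inside the convex hull of the minority class; in practice one would use a library implementation such as imbalanced-learn's `SMOTE`.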


Enhancing Retrieval-Augmented Generation for Electric Power Industry Customer Support

Chan, Hei Yu, Ho, Kuok Tou, Ma, Chenglong, Si, Yujing, Lin, Hok Lai, Lam, Sa Lei

arXiv.org Artificial Intelligence

Many AI customer service systems use standard NLP pipelines or finetuned language models, which often fall short on ambiguous, multi-intent, or detail-specific queries. This case study evaluates five recent techniques for building a robust customer support system in the electric power domain: query rewriting, RAG Fusion, keyword augmentation, intent recognition, and context reranking. We compare vector-store and graph-based RAG frameworks, ultimately selecting the graph-based RAG for its superior performance in handling complex queries. We find that query rewriting improves retrieval for queries using non-standard terminology or requiring precise detail. RAG Fusion boosts performance on vague or multifaceted queries by merging multiple retrievals. Reranking reduces hallucinations by filtering irrelevant contexts. Intent recognition supports the decomposition of complex questions into more targeted sub-queries, increasing both relevance and efficiency. In contrast, keyword augmentation negatively impacts results due to biased keyword selection. Our final system combines intent recognition, RAG Fusion, and reranking to handle disambiguation and multi-source queries. Evaluated on both a GPT-4-generated dataset and a real-world electricity provider FAQ dataset, it achieves 97.9% and 89.6% accuracy respectively, substantially outperforming baseline RAG models.
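RAG Fusion merges the ranked results of several retrievals (e.g., from multiple query rewrites) into one list, commonly via reciprocal rank fusion. A minimal sketch of that merging step, assuming documents are identified by hashable keys (the function name and constant `k=60` follow the usual RRF convention, not necessarily this system's settings):

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked document lists into one.
    Each document's score is the sum over lists of 1 / (k + rank),
    so documents ranked highly in many lists rise to the top."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Documents appearing near the top of multiple retrievals dominate, which is why fusion helps on vague or multifaceted queries: no single retrieval has to be right on its own.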


Machine Unlearning using a Multi-GAN based Model

Hatua, Amartya, Nguyen, Trung T., Sung, Andrew H.

arXiv.org Artificial Intelligence

This article presents a new machine unlearning approach that utilizes multiple Generative Adversarial Network (GAN) based models. The proposed method comprises two phases: i) data reorganization, in which synthetic data generated by the GAN models is introduced with inverted class labels for the forget datasets, and ii) fine-tuning of the pre-trained model. The GAN setup consists of two generator-discriminator pairs, which generate synthetic data for the retain and forget datasets respectively. A pre-trained model is then used to obtain class labels for the synthetic datasets, and the class labels of the synthetic and original forget datasets are inverted. Finally, the combined datasets are used to fine-tune the pre-trained model, yielding the unlearned model. We performed experiments on the CIFAR-10 dataset and tested the unlearned models using Membership Inference Attacks (MIA). The label-inversion procedure and the synthetically generated data provide valuable information that enables the model to outperform state-of-the-art models and other standard unlearning classifiers.
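The core data-reorganization step is replacing each forget-set label with a different class before fine-tuning, so gradient updates push the model away from its original predictions on the forgotten data. The abstract does not specify the exact inversion rule; the sketch below uses one common choice, a uniformly random different class, with CIFAR-10's 10 classes assumed:

```python
import random

def invert_labels(forget_labels, num_classes=10, seed=0):
    """For each forget-set label, return a different class chosen
    uniformly at random (one possible 'inverted label' rule)."""
    rng = random.Random(seed)
    inverted = []
    for y in forget_labels:
        choices = [c for c in range(num_classes) if c != y]
        inverted.append(rng.choice(choices))
    return inverted
```

The retain set keeps its true labels, so fine-tuning on the combined data preserves performance on retained classes while degrading it on the forget set.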


OpenAI Codex: Exploration and review of the platform & API

#artificialintelligence

Written by Nick Wallner and Mirio Eggmann in the context of a mini-project for the course AI Applications at the University of Applied Sciences OST. The goal of this project is to use and evaluate OpenAI Codex, including its platform and API.



The Short Anthropological Guide to the Study of Ethical AI

Royer, Alexandrine

arXiv.org Artificial Intelligence

Over the next few years, society as a whole will need to decide which core values it wishes to protect when dealing with technology. Anthropology, a field dedicated to the very notion of what it means to be human, can provide interesting insights into how to cope with and respond to these changes, both in Western society and in other areas of the world. It can be challenging for social science practitioners to grasp and keep up with the pace of technological innovation, with many being unfamiliar with the jargon of AI. This short guide serves as both an introduction to AI ethics and an overview of social science and anthropological perspectives on the development of AI. It intends to give those unfamiliar with the field an insight into the societal impact of AI systems and how, in turn, these systems can lead us to rethink how our world operates.


A Survey on Explainable Artificial Intelligence (XAI): Towards Medical XAI

Tjoa, Erico, Guan, Cuntai

arXiv.org Artificial Intelligence

Recently, artificial intelligence, and machine learning in particular, has demonstrated remarkable performance in many tasks, from image processing to natural language processing, especially with the advent of deep learning. Alongside this research progress, machine learning has spread into many different fields and disciplines. Some of them, such as the medical field, require a high level of accountability, and thus transparency, which means we need to be able to explain machine decisions and predictions and justify their reliability. This requires greater interpretability, which often means we must understand the mechanisms underlying the algorithms. Unfortunately, the black-box nature of deep learning remains unresolved, and many machine decisions are still poorly understood. We provide a review of the interpretability approaches suggested by different research works and categorize them, with the intention of providing an alternative perspective that is, we hope, more tractable for the future adoption of interpretability standards. We explore interpretability in the medical field in further depth, illustrating the complexity of the interpretability issue.